Add benchmarks from HN blog post #260
@nirvdrum you can probably get a 10-20x speedup on these with TruffleRuby :)
@eregon It appears that TruffleRuby no longer works correctly with cool.io: https://github.com/Shopify/yjit-bench/actions/runs/7426843741/job/20211367451. This doesn't seem to be a problem with yjit-bench but with TruffleRuby. Could you fix this issue?
With
Ah, so we expected a left shift to not have a dynamic amount, and it did in this method. The amount can be as large as 11, so if we edit: I tried chain-guarding 11 times, but falling back immediately after the first chain was faster.
Useful but not mandatory: deterministic solvers like this are great for doing a quick check after the run_benchmark() loop to make sure we're getting the right result (after is better than before, since we want to check the answer with JITted code). That way, if something starts returning the wrong answer, we'll know really quickly.
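A minimal sketch of such a post-run check. It assumes a stand-in `run_benchmark` (the real yjit-bench harness also handles warmup and timing, so treat this definition as hypothetical) and an illustrative `solve_nqueens` helper that is not taken from the PR. The expected value 92 is the well-known solution count for 8 queens.

```ruby
# Hypothetical stand-in for the harness loop, for illustration only.
def run_benchmark(iterations)
  iterations.times { yield }
end

# Count n-queens solutions with a simple bitmask backtracking search.
def solve_nqueens(n, row = 0, cols = 0, diag1 = 0, diag2 = 0)
  return 1 if row == n

  count = 0
  full = (1 << n) - 1
  free = full & ~(cols | diag1 | diag2)
  while free != 0
    bit = free & -free # lowest free column in this row
    free ^= bit
    count += solve_nqueens(n, row + 1, cols | bit,
                           ((diag1 | bit) << 1) & full, (diag2 | bit) >> 1)
  end
  count
end

EXPECTED_SOLUTIONS = 92 # known answer for n = 8

result = nil
run_benchmark(5) { result = solve_nqueens(8) }

# Check after the loop, so the answer comes from code that had a chance to be JIT-compiled.
raise "wrong answer: #{result}" unless result == EXPECTED_SOLUTIONS
```

Checking only the last result keeps the verification itself out of the timed iterations.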
Seeing this now. Thanks, I'll take a look. Let's track it in #262
Taking a quick look at the diff, it's rather clear the Ruby code there is not idiomatic and is formatted in a rather unusual way. That is expected given the note at https://github.com/attractivechaos/plb2: "As I am mostly a C programmer, implementations in other languages may be suboptimal and there are no implementations in functional languages."
While it looks like mostly style differences, there are also performance implications for `for` vs `each` and other loop methods.
I think it would be worth improving these benchmarks, and also try to upstream that, so it looks like regular/typical Ruby code, and not C translated to Ruby, which at least looks unrepresentative of typical Ruby code.
My guess is after small tweaks it's actually reasonable code, but right now it's hard because even the indentation is unusual and displays very poorly (at least on GitHub).
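For reference, a small sketch of the loop forms being compared. It only illustrates the semantic difference (scoping) and measures nothing; which form is faster depends on the Ruby implementation and is not asserted here. The variable names are illustrative.

```ruby
values = [1, 2, 3]

# `for` is a keyword loop: it iterates via `each` but does not introduce
# a new scope, so the loop variable `x` stays visible after the loop.
sum_for = 0
for x in values
  sum_for += x
end

# `each` takes a block; the block parameter `y` is local to the block.
sum_each = 0
values.each { |y| sum_each += y }

# Often the most idiomatic form: let Enumerable do the work.
sum_idiomatic = values.sum
```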
def matgen(n)
  tmp = 1.0 / n / n
  a = Array.new(n) { Array.new(n) { 0 } }
This is an example of something suboptimal and somewhat non-idiomatic; `Array.new(n, 0)` is much faster.
Whereas I looked at that and thought, "oh, sometimes I see people use the default value like this to avoid having to do early initialization, I'm glad we're getting a benchmark for it".
But you're right, it's the exception and not the rule.
The outer `Array.new` needs to use a block, so that form is getting benchmarked anyway :)
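A short sketch of why the outer call needs the block form while the inner one does not (variable names are illustrative, not from the benchmark):

```ruby
n = 3

# Flat row: a shared default is fine because Integers are immutable.
row = Array.new(n, 0)

# Pitfall: passing an Array as the default makes every row the same object.
aliased = Array.new(n, Array.new(n, 0))
aliased[0][0] = 1
# aliased[1][0] is now 1 as well, since all rows alias one Array.

# The outer block builds a fresh row per index; the inner call can use the default form.
matrix = Array.new(n) { Array.new(n, 0) }
matrix[0][0] = 1
# matrix[1][0] is still 0.
```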
I took a stab at cleaning them up, which got accepted: https://github.com/attractivechaos/plb2/tree/bbff5fba14b7ad55eac1ec1faa6f3db8798bcc70/src/ruby Timings improved for matmul and sudoku.
Nice!
Feel free to open a PR to update the benchmarks here as well.
I did it in #268
Thanks Benoit
The other day I saw this HN blog post about benchmarking 20 programming languages, in which they had benchmarked Python and Ruby (with YJIT!) among many others. The good news is that Ruby with YJIT greatly outperformed Python. The less good news is that we're still far behind other languages including PHP.
I asked the author for permission to include the Ruby benchmarks that he used. These are synthetic with big loops. I think they could be useful to benchmark changes we make to the register allocator. It would also be interesting to see if we can make performance on the matrix multiplication benchmark go up significantly by implementing floating point addition and multiplication.
It's also interesting that we see zero speed boost on `nqueens` even though `--yjit-stats` reports zero exits. This suggests there is an exit that we're not catching/counting somewhere @k0kubun @XrXr